Using Distributed Balanced Trees Over DHTs for Building Large-scale Indexes

Authors

  • Nuno Lopes
  • Carlos Baquero
Abstract

DHT systems are structured overlay networks capable of using P2P resources as a scalable storage platform for very large data applications. However, their efficiency depends on a level of uniformity in the association of data to index keys that is often not present in inverted indexes. Index data tends to follow non-uniform distributions, often power-law distributions, creating intense local storage hotspots and network bottlenecks on specific hosts. Current techniques like caching cannot, alone, cope with this
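
As a rough illustration of the hotspot effect described in the abstract, the following Python sketch (not taken from the paper) places keywords on hosts by hashing and gives each keyword a Zipf-like posting-list size. The host count, vocabulary size, list sizes, and the `host_for` helper are all assumptions chosen for illustration only.

```python
import hashlib
from collections import Counter

# All constants below are hypothetical values, not figures from the paper.
NUM_HOSTS = 64            # assumed overlay size
NUM_KEYWORDS = 10_000     # assumed vocabulary size
TOP_POSTINGS = 1_000_000  # assumed posting-list size of the most popular keyword


def host_for(key: str) -> int:
    """Map a key to a host by hashing (a simple stand-in for DHT key placement)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_HOSTS


def simulate() -> Counter:
    """Return the number of postings stored on each host."""
    load = Counter()
    for rank in range(1, NUM_KEYWORDS + 1):
        # Zipf-like skew: the keyword of rank r has roughly TOP_POSTINGS / r postings.
        postings = TOP_POSTINGS // rank
        # Each keyword is one object under one key, so it lands whole on one host.
        load[host_for(f"keyword-{rank}")] += postings
    return load


load = simulate()
mean_load = sum(load.values()) / NUM_HOSTS
hottest_host, hottest_load = load.most_common(1)[0]
print(f"mean load per host: {mean_load:.0f} postings")
print(f"hottest host {hottest_host}: {hottest_load} postings")
```

Even though the keys themselves are spread evenly, the host that happens to hold the most popular keyword stores several times the mean load, because a single object cannot be divided by key-based placement alone.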

Similar resources

A Tabu-Based Cache to Improve Range Queries on Prefix Trees

Distributed Hash Tables (DHTs) provide the substrate to build large-scale distributed applications over Peer-to-Peer networks. A major limitation of DHTs is that they only support exact-match queries. In order to offer range queries over a DHT it is necessary to build additional indexing structures. Prefix-based indexes, such as Prefix Hash Tree (PHT), are interesting approaches for building dis...

Building Inverted Indexes Using Balanced Trees Over DHT Systems

Objects containing the document locations for popular keywords are sufficiently large to create storage hotspots at some hosts. Since each object is assigned to a single key, DHT key-based load balancing techniques are incapable of splitting the object across several hosts. Furthermore, caching techniques only reduce network load for query operations and do not handle network load during insert...

Building an Internet-Scale Service For Publishing and Locating XML Documents on PlanetLab

In recent years, there has been a growing interest in peer-to-peer (P2P) based computing and applications. One of the important challenges in P2P environments is to quickly locate relevant data across many participating peers. In this regard, Distributed Hash Tables (DHTs) are a popular solution for building large-scale distributed applications due to their scalability, load balancing and faul...

A combination of DHTs and Peer Clustering for Distributed Information Retrieval

Distributed Hash Tables (DHTs) are very efficient for querying based on key lookups, if only a small number of keys has to be registered by each individual peer. However, building huge term indexes, as required for IR-style keyword search, is impractical with plain DHTs. Due to the large sizes of document term vocabularies, joining peers cause huge amounts of key inserts, and subsequently larg...

dFault: Fault Localization in Large-Scale Peer-to-Peer Systems

Distributed hash tables (DHTs) have been adopted as a building block for large-scale distributed systems. The upshot of this success is that their robust operation is even more important as mission-critical applications begin to be layered on them. Even though DHTs can detect and heal around unresponsive hosts and disconnected links, several hidden faults and performance bottlenecks go undetecte...

Publication date: 2006